Listing of the Claims:

1. (currently amended) A method implemented in a computer system, for clustering a
string, the string including a plurality of characters, the method including: 
identifying R unique n-grams T1...R in the string;
for every unique n-gram Ts:
    if [the] a frequency of Ts in a set of n-gram statistics is not greater than a first threshold:
        clustering the string with a cluster associated with Ts;
    otherwise:
        for every other n-gram Tv in the string T1...R, except s:
            if the frequency of n-gram Tv is greater than the first threshold:
                if the frequency of an n-gram pair Ts-Tv is not greater than a second threshold:
                    clustering the string with a cluster associated with the n-gram pair Ts-Tv;
                otherwise:
                    for every other n-gram Tx in the string T1...R, except s and v:
                        clustering the string with a cluster associated with [the] an n-gram triple Ts-Tv-Tx;
            otherwise:
                do nothing;[.]
where T1...R is a set of n-grams, R is the number of elements in T1...R, and Ts, Tv,
and Tx are members of T1...R.
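
For illustration only, the nested steps recited in claim 1 can be sketched in Python. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names `unique_ngrams` and `cluster_keys`, the default n of 3, the dictionary form of the statistics, and the treatment of a pair as unordered are all hypothetical choices that the claim does not specify.

```python
def unique_ngrams(string, n=3):
    """Return the unique character n-grams of `string`, in first-seen order."""
    return list(dict.fromkeys(string[i:i + n] for i in range(len(string) - n + 1)))


def cluster_keys(string, gram_freq, pair_freq, first_threshold, second_threshold, n=3):
    """Return the set of cluster keys (tuples of n-grams) that the string joins.

    gram_freq maps an n-gram to its frequency; pair_freq maps a frozenset of
    two n-grams to the frequency of that pair (assumed representations).
    """
    grams = unique_ngrams(string, n)
    keys = set()
    for s in grams:
        if gram_freq.get(s, 0) <= first_threshold:
            # Ts is low frequency: cluster on Ts alone.
            keys.add((s,))
            continue
        for v in grams:
            if v == s:
                continue
            if gram_freq.get(v, 0) > first_threshold:
                if pair_freq.get(frozenset((s, v)), 0) <= second_threshold:
                    # Low-frequency pair of high-frequency n-grams.
                    keys.add(tuple(sorted((s, v))))
                else:
                    # Pair is still too common: fall back to triples.
                    for x in grams:
                        if x not in (s, v):
                            keys.add(tuple(sorted((s, v, x))))
            # otherwise (Tv is not above the first threshold): do nothing
    return keys
```

Sorting each key is a design choice of this sketch so that Ts-Tv and Tv-Ts map to the same cluster.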

2. (original) The method of claim 1 further including compiling n-gram statistics.

3. (original) The method of claim 1 further including compiling n-gram pair statistics. 
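
As a hedged illustration of what compiling the statistics of claims 2 and 3 might look like, the sketch below counts, for each n-gram and each unordered pair of n-grams, the number of strings that contain it. The claims do not define how frequency is measured, so this per-string counting, the function name `compile_statistics`, and the default n of 3 are assumptions.

```python
from collections import Counter
from itertools import combinations


def compile_statistics(strings, n=3):
    """Count how many strings contain each n-gram and each n-gram pair."""
    gram_freq = Counter()
    pair_freq = Counter()
    for string in strings:
        # Unique n-grams of this string, sorted for a stable pairing order.
        grams = sorted({string[i:i + n] for i in range(len(string) - n + 1)})
        gram_freq.update(grams)
        pair_freq.update(frozenset(pair) for pair in combinations(grams, 2))
    return gram_freq, pair_freq
```

Because both results are `Counter` objects, looking up an n-gram or pair that never occurred yields 0 rather than raising an error.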
4. (previously presented) A method implemented in a computer system, for clustering a
plurality of strings, each string including a plurality of characters, the method including: 

identifying unique n-grams in each string; 

clustering each string with zero or more clusters associated with low frequency n-grams 
from that string; and 

clustering each string with zero or more clusters associated with low-frequency pairs of 
high frequency n-grams from that string. 
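
The two clustering steps of claim 4 can be sketched over a list of strings as follows. This is an illustrative sketch only: it assumes that "low frequency" means not greater than a threshold and "high frequency" means greater than it, that the statistics are supplied as dictionaries, and that pairs are keyed by frozenset; the name `cluster_strings` is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cluster_strings(strings, gram_freq, pair_freq, first_threshold, second_threshold, n=3):
    """Map each cluster key (a tuple of n-grams) to the strings placed in it."""
    clusters = defaultdict(list)
    for string in strings:
        grams = sorted({string[i:i + n] for i in range(len(string) - n + 1)})
        low = [g for g in grams if gram_freq.get(g, 0) <= first_threshold]
        high = [g for g in grams if gram_freq.get(g, 0) > first_threshold]
        # Zero or more clusters for low-frequency n-grams.
        for g in low:
            clusters[(g,)].append(string)
        # Zero or more clusters for low-frequency pairs of high-frequency n-grams.
        for pair in combinations(high, 2):
            if pair_freq.get(frozenset(pair), 0) <= second_threshold:
                clusters[pair].append(string)
    return dict(clusters)
```

A string whose n-grams are all high frequency and whose pairs are all common joins no cluster in this sketch, which matches the "zero or more" wording.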

5. (original) The method of claim 4 further including: 

where a string does not include any low-frequency pairs of high frequency n-grams, 
associating that string with clusters associated with triples of n-grams including 
the pair. 
6. (currently amended) A method implemented in a computer system, for clustering a
string, the string including a plurality of characters, the method including: 

identifying R unique n-grams T1...R in the string;
for every unique n-gram Ts:
    if [the] a frequency of Ts in a set of n-gram statistics is not greater than a first threshold:
        clustering the string with a cluster associated with Ts;
    otherwise:
        for i = 1 to Y:
            for every unique set of i n-grams Tu in the string T1...R, except s:
                if the frequency of the n-gram set Ts-Tu is not greater than a second threshold:
                    clustering the string with a cluster associated with the n-gram set Ts-Tu;
        if the string has not been associated with a cluster with this value of Ts:
            for every unique set of Y+1 n-grams Tu in the string T1...R, except s:
                clustering the string with a cluster associated with the Y+2 n-gram group Ts-Tu;[.]
where T1...R is a set of n-grams, R is the number of elements in T1...R, Ts and Tu
are members of T1...R, Tu is a subset of T1...R, and i and Y are integers.
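
The generalized loop of claim 6 can be sketched with `itertools.combinations`. This is a rough sketch under assumptions, not the claimed implementation: it takes the already-identified unique n-grams as input, assumes group frequencies are supplied in a dictionary keyed by frozenset, and reads the fallback step as running once after the loop over i. The name `cluster_keys_general` is hypothetical.

```python
from itertools import combinations


def cluster_keys_general(grams, gram_freq, set_freq, first_threshold, second_threshold, Y=1):
    """Return the cluster keys (frozensets of n-grams) for one string's unique n-grams."""
    keys = set()
    for s in grams:
        if gram_freq.get(s, 0) <= first_threshold:
            keys.add(frozenset((s,)))
            continue
        others = [g for g in grams if g != s]
        clustered = False
        for i in range(1, Y + 1):
            # Every unique set of i other n-grams, combined with Ts.
            for u in combinations(others, i):
                group = frozenset((s, *u))
                if set_freq.get(group, 0) <= second_threshold:
                    keys.add(group)
                    clustered = True
        if not clustered:
            # No sufficiently rare group was found for this Ts:
            # fall back to every group of Ts plus Y+1 other n-grams.
            for u in combinations(others, Y + 1):
                keys.add(frozenset((s, *u)))
    return keys
```

With the default `Y=1`, the groups tested are pairs and the fallback groups are triples.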

7. (original) The method of claim 6 where Y = 1.

8. (original) The method of claim 6 further including compiling n-gram statistics.

9. (original) The method of claim 6 further including compiling n-gram group statistics. 
10. (currently amended) A computer program, stored on a tangible storage medium, for
use in clustering a string, the program including executable instructions that cause a computer 
to: 

identify R unique n-grams T1...R in the string;
for every unique n-gram Ts:
    if [the] a frequency of Ts in a set of n-gram statistics is not greater than a first threshold:
        cluster the string with a cluster associated with Ts;
    otherwise:
        for every other n-gram Tv in the string T1...R, except s:
            if the frequency of n-gram Tv is greater than the first threshold:
                if the frequency of an n-gram pair Ts-Tv is not greater than a second threshold:
                    cluster the string with a cluster associated with the n-gram pair Ts-Tv;
                otherwise:
                    for every other n-gram Tx in the string T1...R, except s and v:
                        cluster the string with a cluster associated with [the] an n-gram triple Ts-Tv-Tx;
            otherwise:
                do nothing;[.]
where T1...R is a set of n-grams, R is the number of elements in T1...R, and
Ts, Tv, and Tx are members of T1...R.

11. (original) The computer program of claim 10 further including executable instructions
that cause a computer to compile n-gram statistics. 

12. (original) The computer program of claim 10 further including executable instructions 
that cause a computer to compile n-gram pair statistics. 